The original animation

Visual Capitalist published an animation created by James Eagle showing how smartphone vendor market shares developed over 30 years. In the center of the visualization is a donut chart displaying monthly market share values. The chart includes a legend, which repeats the share per manufacturer. Manufacturers displayed in the donut chart are highlighted in the legend.

The goal of this tutorial is to create the animated donut chart in R with {ggplot2}. Because the original data are not openly available, we will use a different data source, and (for now) we will not create the legend.

Packages

Let’s load the packages we will use for creating the animation, especially {ggplot2} via the Tidyverse and {gganimate}.

library(tidyverse)
library(gganimate)
library(ggtext)
library(lubridate)

Get the data

Statcounter provides mobile vendor market shares back to 2010. The original animation goes back to the 1990s and uses data which are not openly available.

A peek at the methodology

Statcounter gives an overview of how the data are collected in their FAQ section:

Statcounter is a web analytics service. Our tracking code is installed on more than 2 million sites globally. These sites cover various activities and geographic locations. Every month, we record billions of page views to these sites. For each page view, we analyse the browser/operating system/screen resolution used and we establish if the page view is from a mobile device.

Statcounter data is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. It can be downloaded in CSV format via https://gs.statcounter.com/vendor-market-share/mobile/worldwide/#monthly-201003-202205. After the download, place the CSV file in your project directory before loading it into the R session. We are using the maximum period available, which is March 2010 to May 2022 (as of writing this tutorial).

filename <- "vendor-ww-monthly-201003-202205.csv"
df_raw <- read_csv(filename)
## Rows: 147 Columns: 70
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): Date
## dbl (69): Samsung, Apple, Unknown, Nokia, Huawei, Xiaomi, LG, Oppo, Sony, Mo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
n_vendors <- ncol(df_raw) - 1
head(df_raw)
## # A tibble: 6 × 70
##   Date    Samsung Apple Unknown Nokia Huawei Xiaomi    LG  Oppo  Sony Motorola
##   <chr>     <dbl> <dbl>   <dbl> <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 2010-04    2.37  33.2       0  37.6      0      0  0.23     0  8.3      0.3 
## 2 2010-05    3.27  33.1       0  37.4      0      0  0.08     0  7.96     0.12
## 3 2010-06    3.86  30.7       0  38.3      0      0  0.08     0  7.9      0.11
## 4 2010-07    3.83  30.1       0  36.8      0      0  0.21     0  7.9      0.29
## 5 2010-08    4.07  29.8       0  36.4      0      0  0.22     0  7.81     0.35
## 6 2010-09    4.53  26.7       0  38.1      0      0  0.18     0  7.84     0.4 
## # … with 59 more variables: HTC <dbl>, Lenovo <dbl>, RIM <dbl>, Micromax <dbl>,
## #   Mobicel <dbl>, Asus <dbl>, `General Mobile` <dbl>, Google <dbl>, BBK <dbl>,
## #   Vivo <dbl>, ZTE <dbl>, Alcatel <dbl>, Realme <dbl>, OnePlus <dbl>,
## #   Tecno <dbl>, Infinix <dbl>, Lava <dbl>, Gionee <dbl>, Vodafone <dbl>,
## #   Turkcell <dbl>, Wiko <dbl>, Coolpad <dbl>, Lyf <dbl>, Casper <dbl>,
## #   Itel <dbl>, Hisense <dbl>, Vestel <dbl>, Spice <dbl>, AIS <dbl>,
## #   Meizu <dbl>, bq <dbl>, QMobile <dbl>, LeEco <dbl>, Panasonic <dbl>, …

Each vendor’s market share is stored in its own column. In total, the dataframe contains market shares for 69 vendor columns, including the catch-all categories “Unknown” and “Other”.

colnames(df_raw)
##  [1] "Date"             "Samsung"          "Apple"            "Unknown"         
##  [5] "Nokia"            "Huawei"           "Xiaomi"           "LG"              
##  [9] "Oppo"             "Sony"             "Motorola"         "HTC"             
## [13] "Lenovo"           "RIM"              "Micromax"         "Mobicel"         
## [17] "Asus"             "General Mobile"   "Google"           "BBK"             
## [21] "Vivo"             "ZTE"              "Alcatel"          "Realme"          
## [25] "OnePlus"          "Tecno"            "Infinix"          "Lava"            
## [29] "Gionee"           "Vodafone"         "Turkcell"         "Wiko"            
## [33] "Coolpad"          "Lyf"              "Casper"           "Itel"            
## [37] "Hisense"          "Vestel"           "Spice"            "AIS"             
## [41] "Meizu"            "bq"               "QMobile"          "LeEco"           
## [45] "Panasonic"        "True"             "Acer"             "Kyocera"         
## [49] "Reliance Digital" "Pantech"          "InFocus"          "Intex"           
## [53] "Xolo"             "Nintendo"         "Smartfren"        "Archos"          
## [57] "Blu"              "HP"               "i-Mobile"         "Condor"          
## [61] "Avea"             "Karbonn"          "dtac"             "Yu"              
## [65] "T-Mobile"         "Symphony"         "Sharp"            "Lanix"           
## [69] "Infinex"          "Other"

Transform the data

For our plot, we have to transform the dataframe into long format, i.e. each combination of month and vendor has to be encoded in a row instead of each vendor in a column. Since displaying all vendors would lead to a cluttered chart, we lump vendors with smaller market shares into an “Other” category. All vendors with a market share below the threshold are recoded as “Other” month by month, so a vendor can be shown in one month and lumped in another. There are a couple of months with a rather high share of “Unknown”, which we add to “Other” as well.

threshold_for_lumping <- 3.1

df_long <- df_raw %>%
  pivot_longer(cols = -Date, names_to = "vendor", values_to = "market_share") %>% 
  # group vendors with smaller market shares to "Other" based on monthly shares
  mutate(vendor2 = ifelse(market_share < threshold_for_lumping | 
                            vendor == "Unknown", "Other", vendor),
         date = ym(Date)) %>% 
  # the data from March 2010 is incomplete, remove it
  filter(date > as_date("2010-03-01")) %>% 
  count(date, vendor2, wt = market_share, name = "market_share")
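
To make the month-by-month lumping easier to follow, here is a minimal sketch on made-up data. The tibble toy_raw, its vendor names and its values are assumptions for illustration only and not part of the Statcounter data; the pipeline mirrors the one above with the same threshold of 3.1.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# made-up shares in the same wide layout as the Statcounter CSV (illustration only)
toy_raw <- tibble(
  Date    = c("2010-04", "2010-05"),
  Big     = c(60, 55),
  Small   = c(2, 4),
  Tiny    = c(1, 1),
  Unknown = c(37, 40)
)

toy_long <- toy_raw %>%
  pivot_longer(cols = -Date, names_to = "vendor", values_to = "market_share") %>%
  # lump small vendors and "Unknown" into "Other", separately for each month
  mutate(vendor2 = ifelse(market_share < 3.1 | vendor == "Unknown", "Other", vendor),
         date = ym(Date)) %>%
  # sum the shares of all vendors that ended up in the same category
  count(date, vendor2, wt = market_share, name = "market_share")

toy_long
```

In this sketch, “Small” is lumped into “Other” in April (2 is below the threshold) but kept as its own category in May (4 is above it), which is why the set of displayed vendors can change from month to month.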

The first few rows of the transformed dataframe:

head(df_long)
## # A tibble: 6 × 3
##   date       vendor2 market_share
##   <date>     <chr>          <dbl>
## 1 2010-04-01 Apple          33.2 
## 2 2010-04-01 Nokia          37.6 
## 3 2010-04-01 Other           5.11
## 4 2010-04-01 RIM            15.9 
## 5 2010-04-01 Sony            8.3 
## 6 2010-05-01 Apple          33.1

These are all vendors in the new grouped variable. Each of them appears in the chart in at least one month, namely whenever its market share reaches the threshold:

unique(df_long$vendor2)
##  [1] "Apple"   "Nokia"   "Other"   "RIM"     "Sony"    "Samsung" "HTC"    
##  [8] "LG"      "Huawei"  "Lenovo"  "Xiaomi"  "Oppo"    "Mobicel" "Vivo"

First plot with facets

Let’s create the first basic graph, capturing the market share for each month of the first year:

df_long %>% 
  filter(date <= as_date("2011-03-01")) %>% 
  ggplot(aes(vendor2, market_share)) +
  geom_col() +
  coord_flip() +
  facet_wrap(vars(date))

Create a donut chart with ggplot2

{ggpubr} is a great package for creating several chart types, including donut charts, without dealing with all the details of {ggplot2}. Since we will be making some customizations for our animation, though, we will create the plot from scratch in {ggplot2}.

Start with a pie chart

The base for the donut chart is a pie chart. Creating pie charts in {ggplot2} works just like creating a stacked bar chart in a polar coordinate system. We achieve this by adding coord_polar(theta = "y") to the plot. For this static chart we select the most recent month.

Instead of the default color palette we use the Lapras palette from the {palettetown} package, accessed via {paletteer}.

df_long %>% 
  filter(date == as_date("2022-05-01")) %>%
  ggplot(aes(x = 1, market_share, group = vendor2)) +
  geom_col(aes(fill = vendor2), position = "fill") +
  paletteer::scale_fill_paletteer_d("palettetown::lapras") +
  coord_polar(theta = "y") +
  theme_void() # removes all theme elements
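
To see the relationship between the stacked bar and the pie, it can help to build the plot once without the polar coordinate system. The following is a minimal sketch; the data frame toy and its values are made up for illustration.

```r
library(ggplot2)

# made-up shares, for illustration only
toy <- data.frame(vendor = c("A", "B", "C"), market_share = c(50, 30, 20))

# a single stacked bar at x = 1, scaled to 100 %
p_bar <- ggplot(toy, aes(x = 1, y = market_share, fill = vendor)) +
  geom_col(position = "fill") +
  theme_void()

# the same bar, with the y axis bent into a circle
p_pie <- p_bar + coord_polar(theta = "y")

p_bar
p_pie
```

Printing p_bar shows one stacked column, while p_pie shows the same segments as slices. The only difference between the two objects is the coordinate system.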

Adding the labels for each category is a bit trickier. We have to calculate each label position from the cumulative sums of the shares, so that the label sits at the midpoint of its slice.

p <- df_long %>% 
  filter(date == as_date("2022-05-01")) %>%
  # calculate the label positions and format the label texts
  arrange(date, market_share) %>% 
  mutate(label_pos = cumsum(market_share) / sum(market_share) 
         - 0.5 * market_share /  sum(market_share),
         label = sprintf("%s\n%s %%", vendor2, market_share),
         vendor2 = fct_reorder(vendor2, -market_share)) %>% 
  ggplot(aes(x = 1, market_share, group = vendor2)) +
  geom_col(aes(fill = vendor2), position = "fill") +
  geom_label(aes(x = 1.5, label = label, y = label_pos)) + 
  paletteer::scale_fill_paletteer_d("palettetown::lapras") +
  coord_polar(theta = "y") +
  guides(fill = "none") +
  theme_void()
p
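
The label position formula used above can be checked by hand with a few numbers. This is a small sketch with made-up shares, sorted in ascending order as in the arrange() call; the vector names are chosen for illustration.

```r
# made-up shares, sorted ascending (illustration only)
share <- c(10, 30, 60)

# upper edge of each slice on the 0-1 scale used by position = "fill"
upper <- cumsum(share) / sum(share)

# half the width of each slice
half <- 0.5 * share / sum(share)

# midpoint of each slice: upper edge minus half the slice width
label_pos <- upper - half
label_pos
# 0.05 0.25 0.70
```

The first slice spans 0 to 0.1, so its label sits at 0.05; the last slice spans 0.4 to 1, so its label sits at 0.7.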

… and the donut

We simply draw a white circle (i.e. the same color as the background) on top of the pie chart. Voilà, a donut chart. Adjust donut_hole_width to change the size of the hole. A value of 0 will result in a pie chart, a value of 1.5 or greater will cover the whole pie chart.

Inside the hole we display the current month using geom_text().

# adjust the size of the inner ring
donut_hole_width <- 0.75

p + 
  annotate("rect", xmin = 0, xmax = donut_hole_width, ymin = -Inf, ymax = Inf,
           fill = "white") +
  geom_text(aes(x = 0, y = 0, label = format(date, "%B\n%Y")), stat = "unique",
            size = 8)

First animation

We now build the plot for all months at once and let {gganimate} step through them. Compared to the static chart, the label positions are calculated within each month, the vendor order is fixed across months so that the slices keep their order, and the styling switches to a dark theme. The plot uses the Fira Sans font family, which has to be available on your system.

p_donut <- 
  df_long %>% 
  mutate(vendor2 = factor(vendor2, levels = unique(df_long$vendor2))) %>% 
  # now we need to calculate the label position within each month
  group_by(date) %>% 
  arrange(desc(vendor2), .by_group = TRUE) %>% 
  mutate(
    label_pos = cumsum(market_share) / sum(market_share) 
    - 0.5 * market_share / sum(market_share),
    label = sprintf("%s\n%s %%", vendor2, 
                    scales::number(market_share, accuracy = 0.1)),
    label = fct_reorder(label, market_share)) %>% 
  ungroup() %>% 
  ggplot(aes(x = 1, market_share, group = vendor2)) +
  geom_col(aes(fill = vendor2), position = "fill") +
  # vendor labels, repelled so that they do not overlap
  ggrepel::geom_text_repel(aes(x = 1.5, label = label, y = label_pos),
                           hjust = 0, family = "Fira Sans", segment.size = 0.3,
                           min.segment.length = 0, nudge_x = 0.3,
                           point.padding = 1e-05, color = "white") +
  # semi-transparent ring
  annotate("rect", xmin = 0, xmax = donut_hole_width + 0.15, ymin = -Inf, ymax = Inf,
           fill = alpha("grey4", 0.25)) +
  # inner ring
  annotate("rect", xmin = 0, xmax = donut_hole_width, ymin = -Inf, ymax = Inf,
           fill = "grey4") +
  # month and year in the center of the donut
  geom_richtext(
    aes(
      x = 0, y = 0,
      label = sprintf(
        "<span style='color: grey80'>%s</span><br>
        <span style='font-size: 40pt'>%s</span>", 
        format(date, "%B"), year(date))), 
    stat = "unique", size = 8, family = "Fira Sans SemiBold", color = "white",
    fill = NA, label.size = 0, lineheight = 1.67) +
  paletteer::scale_fill_paletteer_d("palettetown::lapras") +
  coord_polar(theta = "y") +
  guides(fill = "none", color = "none") +
  labs(
    title = "Mobile phone market 2010-2022",
    subtitle = "Market share of mobile phone vendors"
  ) + 
  theme_void(base_family = "Fira Sans") + 
  theme(
    plot.background = element_rect(color = NA, fill = "grey4"),
    plot.margin = margin(10, 10, 10, 10),
    text = element_text(color = "grey80"),
    plot.title = element_text(
      family = "Fira Sans SemiBold", color = "white", size = 16),
    plot.title.position = "plot"
  )
# one animation state per month
p_anim <- p_donut +
  transition_states(date)

# render and save the animation (rendering may take a while)
anim <- animate(p_anim, res = 100, width = 720, height = 640, fps = 12,
                duration = 60)
anim_save("animated-donut-chart.gif", anim)
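
To get a feeling for the animate() settings, the following back-of-the-envelope sketch estimates how many frames each month receives. It assumes that the total frame count is fps times duration and that the data contain 146 monthly states (April 2010 to May 2022); both are assumptions to check against your own data and the {gganimate} documentation. The variable names are chosen for illustration.

```r
fps      <- 12   # frames per second, as passed to animate()
duration <- 60   # length of the animation in seconds

# assumed total number of frames
n_frames <- fps * duration

# assumed number of months in the data (April 2010 to May 2022)
n_states <- 146

# average number of frames per month, including the transitions
frames_per_state <- n_frames / n_states
round(frames_per_state, 1)
```

Under these assumptions each month gets just under five frames, i.e. a bit less than half a second. Increasing duration slows the animation down, while increasing fps makes it smoother at the cost of a longer rendering time and a larger file.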